
20 research outputs found

Graph Convolutions For Teams Of Robots

Author: Khan Arbaaz
Publication venue: ScholarlyCommons
Publication date: 01/01/2021

In many robotics applications, teams of robots operate in dynamic environments that require the design of complex communication and control schemes. The problem is made easier if one assumes the presence of an oracle that has instantaneous access to the states of all entities in the environment and can communicate simultaneously without any loss. However, such an assumption is unrealistic, especially when there are a large number of robots. More specifically, we are interested in decentralized control policies for teams of robots that use only local communication and sensory information to achieve high-level team objectives. We first make the case for using distributed reinforcement learning to learn local behaviours by optimizing for a sparse team-wide reward, as opposed to existing model-based methods. A central caveat of learning policies using model-free reinforcement learning is the lack of scalability. To achieve results that scale to large teams, we introduce a novel paradigm where the policies are parametrized by graph convolutions. We also develop new methodologies to train these policies and derive technical insights into their behaviors. Building upon these, we design perception-action loops for teams of robots that rely only on noisy visual sensors, a learned history state and local information from nearby robots to achieve complex team-wide objectives. We demonstrate the effectiveness of our methods on several large-scale multi-robot tasks.
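As a rough illustration of what a graph convolution over a robot team can look like, the sketch below computes the common polynomial graph filter, the sum over k of S^k X H_k. This form is an assumption chosen for illustration; the abstract does not state the exact parametrization used in the thesis. The names `graph_convolution`, `adjacency`, `features`, and `taps` are hypothetical. Here S is the communication graph, X holds one feature row per robot, and each H_k is a weight matrix shared by all robots.

```python
def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def graph_convolution(adjacency, features, taps):
    """Return sum_k S^k X H_k with S=adjacency, X=features, H_k=taps[k].

    Assumed, illustrative form of a graph filter; not taken from the thesis.
    """
    n_robots, n_out = len(features), len(taps[0][0])
    output = [[0.0] * n_out for _ in range(n_robots)]
    shifted = features  # S^0 X: each robot's own features
    for tap in taps:
        term = mat_mul(shifted, tap)
        output = [[o + t for o, t in zip(out_row, term_row)]
                  for out_row, term_row in zip(output, term)]
        # One more round of exchange with direct neighbours only.
        shifted = mat_mul(adjacency, shifted)
    return output


# Three robots on a line (0 - 1 - 2); only robot 0 has a nonzero feature.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0], [0.0], [0.0]]
taps = [[[1.0]], [[0.5]]]  # two filter taps, one input and one output feature
result = graph_convolution(adjacency, features, taps)
print(result)  # [[1.0], [0.5], [0.0]]
```

With two taps, robot 1 picks up a contribution from its direct neighbour while robot 2, two hops away, sees nothing. This matches the abstract's emphasis on purely local communication, and because the taps are shared across robots, the number of parameters does not grow with team size.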

ScholarlyCommons@Penn

End-to-End Navigation in Unknown Environments using Neural Networks

Authors: Atanasov Nikolay; Karydis Konstantinos; Khan Arbaaz; Kumar Vijay; Lee Daniel D.; Zhang Clark
Publication date: 23/07/2017

We investigate how a neural network can learn perception-action loops for navigation in unknown environments. Specifically, we consider how to learn to navigate in environments populated with cul-de-sacs that represent convex local minima that the robot could fall into instead of finding a set of feasible actions that take it to the goal. Traditional methods rely on maintaining a global map to solve the problem of overcoming a long cul-de-sac. However, due to errors induced by local and global drift, it is highly challenging to maintain such a map for long periods of time. One way to mitigate this problem is to use learning techniques that do not rely on hand-engineered map representations and instead output appropriate control policies directly from their sensory input. We first demonstrate that such a problem cannot be solved directly by deep reinforcement learning due to the sparse reward structure of the environment. Further, we demonstrate that deep supervised learning also cannot be used directly to solve this problem. We then investigate network models that offer a combination of reinforcement learning and supervised learning and highlight the significance of adding fully differentiable memory units to such networks. We evaluate our networks on their ability to generalize to new environments and show that adding memory to such networks offers huge jumps in performance.

Comment: Workshop on Learning Perception and Control for Autonomous Flight: Safety, Memory and Efficiency, Robotics Science and Systems 201

arXiv.org e-Print Archive

eScholarship - University of California

Memory Augmented Control Networks

Authors: Atanasov Nikolay; Karydis Konstantinos; Khan Arbaaz; Kumar Vijay; Lee Daniel D.; Zhang Clark
Publication date: 27/12/2017

Planning problems in partially observable environments cannot be solved directly with convolutional networks and require some form of memory. However, even memory networks with sophisticated addressing schemes are unable to learn intelligent reasoning satisfactorily due to the complexity of simultaneously learning to access memory and plan. To mitigate these challenges, we introduce the Memory Augmented Control Network (MACN). The proposed network architecture consists of three main parts. The first part uses convolutions to extract features, and the second part uses a neural network-based planning module to pre-plan in the environment. The third part uses a network controller that learns to store those specific instances of past information that are necessary for planning. The performance of the network is evaluated in discrete grid world environments for path planning in the presence of simple and complex obstacles. We show that our network learns to plan and can generalize to new environments.

arXiv.org e-Print Archive

eScholarship - University of California

Neural Network Memory Architectures for Autonomous Robot Navigation

Authors: Atanasov Nikolay; Chen Steven W; Karydis Konstantinos; Khan Arbaaz; Kumar Vijay; Lee Daniel D.
Publication date: 22/05/2017

This paper highlights the significance of including memory structures in neural networks when the latter are used to learn perception-action loops for autonomous robot navigation. Traditional navigation approaches rely on global maps of the environment to overcome cul-de-sacs and plan feasible motions. Yet, maintaining an accurate global map may be challenging in real-world settings. A possible way to mitigate this limitation is to use learning techniques that forgo hand-engineered map representations and infer appropriate control responses directly from sensed information. An important but unexplored aspect of such approaches is the effect of memory on their performance. This work is a first thorough study of memory structures for deep-neural-network-based robot navigation, and offers novel tools to train such networks from supervision and quantify their ability to generalize to unseen scenarios. We analyze the separation and generalization abilities of feedforward, long short-term memory, and differentiable neural computer networks. We introduce a new method to evaluate the generalization ability by estimating the VC-dimension of networks with a final linear readout layer. We validate that the VC estimates are good predictors of actual test performance. The reported method can be applied to deep learning problems beyond robotics.

arXiv.org e-Print Archive

eScholarship - University of California